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Abstract 

Do today’s communication technologies hold potential 
to alleviate poverty? The mobile phone’s accessibility 
and use allows us with an unprecedented volume of data 
on social interactions, mobility and more. Can this data 
help us better understand, characterize and alleviate 
poverty in one of the poorest nations in the world. Our 
study is an attempt in this direction. We discuss two 
concepts, which are both interconnected and immensely 
useful for securing the important link between mobile 
accessibility and poverty. 

First, we use the cellular-communications data to 
construct virtual connectivity maps for Senegal, which 
are then correlated with the poverty indicators to learn a 
model. Our model predicts poverty index at any spatial 
resolution. Thus, we generate Poverty Maps for Senegal 
at an unprecedented finer resolution. Such maps are 
essential for understanding what characterizes poverty 
in a certain region, and how it differentiates from other 
regions, for targeted responses for the demographic of 
the population that is most needy. An interesting fact, 
that is empirically proved by our methodology, is that 
a large portion of all communication, and economic 
activity in Senegal is concentrated in Dakar, leaving 
many other regions marginalized. 

Second, we study how user behavioral statistics, 
gathered from cellular-communications, correlate with 
the poverty indicators. Can this relationship be learnt 
as a model to generate poverty maps at a finer resolu¬ 
tion? Surprisingly, this relationship can give us an al¬ 
ternate poverty map, that is solely based on the user 
behavior. Since poverty is a complex phenomenon, 
poverty maps showcasing multiple perspectives, such as 
ours, provide policymakers with better insights for ef¬ 
fective responses for poverty eradication. 


1 Introduction and Motivation 

According to the United Nations Development Pro¬ 
gram’s 2014 Human Development Index (HDI), Senegal 
is ranked 163 out of 187 countries with an HDI index of 
0.485. HDI measures achievement in three basic dimen¬ 
sions of human development: health, knowledge, and 
standard of living. Senegal has a population of 14.1 
million, with 43.1% urban population, and the median 
age of 18.2 years. It is one of the poorest country in 
the world, with over 9.2 million people living in multi¬ 
dimensional poverty. Wealth distribution in Senegal is 
very unequal. 

Poverty incidence remains high, affecting about 
47% of the population. There are wide disparities 
between poverty in rural areas (at 57%)s, and urban 
areas, where the poverty rate is 33%. More than 42% 
of the population lives in rural areas, with a population 
density that varies from 77 people per square kilometer 
to 2 people per square kilometer in the dry regions of 
the country. 

On the other hand, the growth in mobile-cellular 
technology has been very impressive in recent decades. 
It is estimated that there are 95 mobile-cellular tele¬ 
phone subscriptions per 100 inhabitants worldwide [11. 
In Senegal, there are 93 mobile phone subscriptions 
per 100 people, according to the latest world-bank re¬ 
port [2]. 

The power of growth of mobile technology poses 
a question: Can their accessibility be used to identify, 
characterize, and, in turn, alleviate poverty? Ours is 
a case study towards answering this question. An ex¬ 
pected outcome is a high resolution poverty map of 
Senegal, and its poverty analysis, with some recommen¬ 
dations for effective policies for an inclusive growth. We 
believe that such poverty analysis with the growth of 
virtual mobility will be beneficial to a developing econ- 


omy like Senegal. 


dispersal of services. 


1.1 Poverty Maps Currently the poverty maps are 
created using nationally representative household sur¬ 
veys, which requires a lot of man-power, and time, and 
continues to lag for Sub-Saharan Africa compared to the 
world [3]. The data is updated yearly, and assessed for 
poverty progress in 3 years. 

Poverty has traditionally been measured in one di¬ 
mension, usually income or consumption, called income 
poverty. Another internationally comparable poverty 
measure is the World Banks $1.25 per day, which iden¬ 
tifies people who do not reach the minimum income 
poverty line. 

In 2010, the Oxford Poverty and Human Develop¬ 
ment Initiative (OPHI) launched a Multidimensional 
Poverty Index (MPI). It is a composite of 10 indicators 
across three areas - education (years of schooling, school 
enrollment), health (malnutrition, child mortality), and 
living conditions. 

We use MPI for our poverty analysis, since it 
closely aligns with the Human Development Index, and 
is widely accepted to study poverty. MPI is robust 
to decomposition within relevant sub-groups of pop¬ 
ulations, like urban vs rural, geographic regions (dis¬ 
tricts/provinces/states), religion and ethnicity, gender; 
so that targeted policies can be planned for specific de¬ 
mographics. 

The MPI data is available at region level for each 
of the 14 regions in Senegal. Figure [l] depicts the latest 
(2011) poverty map of Senegal. 

2 Contributions 

The main contributions of our work are two-fold. First 
we construct a virtual network for Senegal from cellular- 
communication data (Dataset 1), identify network- 
theoritic measures that correlate well with poverty in¬ 
dicators, learn a model that predicts poverty at a finer 
resolution, and finally build a poverty map for Senegal 
at an arrondissement level. Second, we learn a model 
solely based on the relationship of aggregated user be¬ 
havior (Dataset 3) with poverty indicators, and generate 
a poverty map at a finer resolution. 

Here are detailed technical contributions of our 
work: 

• We construct a virtual network for Senegal, which 
is defined as a who-calls-whom network from the 
mobile communication data. Intuitively, a virtual 
network quantifies the mobile connectivity, and ac¬ 
cessibility to the population. It signifies the macro¬ 
level view of connections or social ties between peo¬ 
ple, dissemination of information or knowledge, or 


• We study Senegal’s virtual network to empirically 
get the most important spatial regions. We assign 
each region a unique score based on its importance 
in the virtual network. We find that a network 
theoretic based measure, called centrality, provides 
a strong correlation with the poverty index. The 
more important the region, the less poor it is on 
the poverty index. 

• As MPI is a composite index, we find how well 
each of the component of MPI correlate with the 
importance of the regions. 

• We apply linear regression to learn a relation¬ 
ship between centrality and the components of the 
poverty indicators. Our model is, then, used to es¬ 
timate the poverty at a finer spatial resolution of 
arrondissements. We show our predicted region- 
level poverty map, and validate that it correlates 
well with the the true poverty map. 

• We provide an in-depth region-level analysis of the 
correlation and poverty, and explain the bias caused 
by the Dakar region as it has very high centrality 
and very low poverty. 

• We also attempt to understand, and characterize 
poverty. Since regions may be poor because of vari¬ 
ous reasons - information poor, but resource rich; or 
resource poor, but information rich. To study this, 
we use the behavior indicators of the users provided 
in Dataset 3. Some of these indicators are: entropy 
of contacts, percentage of calls from home, radius 
of gyration, etc. These indicators characterize in¬ 
dividual users based on how they call, move, and 
interact within the cellular network. We studied 
their correlation with the poverty index and inter¬ 
estingly, found that the correlation was not biased 
by Dakar. Using one of such indicators, which has 
a very strong negative correlation with the poverty 
index, we learn a model for predicting MPI and 
construct an arrondissement level poverty map for 
Senegal. 

3 Related Work 

Call data records (CDR) allow a view of the communi¬ 
cation and mobility patterns of people at an unprece¬ 
dented scale. In the past, several researchers have used 
CDR data to understand human mobility 015 ]. How¬ 
ever, there has been limited work in understanding rela¬ 
tion between CDR data and poverty mmmm- Eagle 
et al. m correlated the diversity in communication with 
socioeconomic deprivation and found a strong positive 
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Figure 1: Overlay of Multidimensional Poverty Index (MPI) and cell phone site locations for 14 regions of Senegal. 


correlation between the diversity in calling patterns and 
socioeconomic deprivation in England. The closest work 
to that presented here is by Smith et al. m who have 
calculated a number of features like introversion, diver¬ 
sity, residuals, and activity to find their correlations be¬ 
tween poverty index (MPI) of Cote d’Ivoire, and further 
build a finer granularity poverty index based on the fea¬ 
ture that gave the best correlations. Our work is dif¬ 
ferent in the following aspects: a), we study the virtual 
communication network using a rigorous network sci¬ 
ence approach and find that centrality based measures 
which focus on the importance or influence of nodes pro¬ 
vide a better correlation with the poverty index, b). we 
treat MPI as a composite index and provide estimate 
models that predict the individual components of the 
index, c). we provide an in-depth region wise analysis 
of the correlation and discover the bias caused by the 
Dakar region because of its unique nature in Senegal, 
and d). we also compare the finer level poverty maps 
generated using the network centric approach with a hu¬ 
man behavior based method. Our work on relating hu¬ 
man behavior with MPI is motivated from work done in 
the past in which behavior indicators are extracted from 
CDR data and used to predict the socioeconomic indica¬ 
tors of a region mm- Specifically, Soto et al. [12] have 
proposed a Support Vector Machine model which uses 
279 features (calling behavioral, mobility, and social) 
extracted from an individual users CDR to predict the 
socioeconomic levels at a census region. However, the 


predictive model requires knowledge of finer granular¬ 
ity poverty data and partial knowledge of a user’s home 
information. Instead, we use the 33 indicators provided 
in the D4D challenge data and provide a methodology 
to correlate the indicators at region and arrondissement 
level without re quiring additional information about 
the users. 

4 Senegal’s Virtual Network 

We define the physical network of a country, as com¬ 
posed of transportation landscape like road, railways, 
ports, which are the nation’s arteries that fuel its eco¬ 
nomic growth. With the burgeoning growth of mobile 
communication, we define a virtual network of a coun¬ 
try, that describes who-calls-whom network. Calls are 
placed for a variety of reasons including request of re¬ 
sources, information dissemination, personal. The call 
data records (CDR), provided by the Orange, provides 
an interesting way to characterize and understand the 
virtual network of Senegal. 

While the physical network determines how people 
move, and goods are transported, virtual network deter¬ 
mines how information or knowledge flows. Currently, 
a good portion of the information and services are dis¬ 
persed virtually. While the physical network is limited 
by the inherent capacity of the roads, and railway net¬ 
work, the virtual network is dynamic. Due to people’s 
mobility across spatial regions, and the ubiquitous na¬ 
ture of cellular technology, both physical and virtual 






networks interact creating complex dynamism. For a 
holistic understanding of any complex phenomenon, we 
need to understand both the networks. 

Static maps are easy to get, but how to get the 
virtual network. Existing gravity models can provide an 
estimate of the flow (with the knowledge of a constant), 
however such estimates are static and over-reliant on 
the spatial proximity between sites. The CDR data, 
however, provides the actual measure of the information 
flow at a finer spatial and temporal resolution. We 
construct a virtual network of Senegal from the CDR 
data. Such a network is generic, and can be used for 
understanding multiple phenomenon involving dynamic 
interactions with the physical network, like e-health 
(while the physical network determines where disease 
spreads next, virtual network determines how it can 
be contained by proper dissemination of preventive 
knowledge), e-education, and e-commerce. 

4.1 How to construct the Virtual Network To 

construct the virtual network, we need two entities: spa¬ 
tial regions, where calls are originated from or are re¬ 
ceived in; and virtual paths that signify communication 
among them. 

In virtual network for Senegal, the spatial regions 
correspond to administrative areas (that can be ar- 
rondissement, departments or regions), and the virtual 
path between each pair of nodes corresponds to the 
volume of mobile communication (number of calls and 
texts) between them during the whole year of 2013. 

For this study, we used the hourly antenna-to- 
antenna traffic available for 1666 cell phone towers 
(sites) to measure communication between sites for 2013 
(Dataset 1), as follows: 

• Create an information flow matrix at site-level M s 
with 1666 rows and 1666 columns, such that the 
entry M?- denotes the number of calls and texts 
exchanged between site i and j during the whole 
year. Each entry M^- represents the calls and texts 
originated at site i and received at site j. 

• To get the arrondissement level information flow, 
we “coarsen” the site to site matrix into a 124 x 124 
matrix M a , such that the entry Mg- denotes the 
total number calls and texts originated at all sites 
in arrondissement i and received at all sites in 
arrondissement j. 

• To get the region level information flow, we 
“coarsen” the arrondissement to arrondissement 
matrix into a 14 x 14 matrix M r , such that the 
entry Mg denotes the total number calls and texts 
originated from all sites in region i and received at 
all sites in region j. 


The resulting virtual network is shown in Figure [2] 
On the map, each region is depicted by the latitude and 
longitude of its geographical centroid. In the graph, the 
size of each node denotes the total number of incoming 
and outgoing calls and texts for the region for the entire 
year. The thickness of the link indicates the volume 
of calls and texts exchanged between the corresponding 
pair of regions. Looking at the map, we see that regions, 
e.g., Dakar, Thies and Ziguinchor, which have low 
MPI are important nodes in the virtual communication 
network. On the other hand, regions with high MPI, 
e.g., Kaffrine and Kolda are not well connected with 
other regions. However, there are regions that are well- 
connected but are poor (e.g., Tambacounda) and regions 
which are poorly connected but are relatively less poor 
(e.g., Kedougou). This indicates that poverty is a 
complex phenomenon and needs to be understood from 
multiple perspectives like relationships with bordering 
countries, unique geographical settings, etc. 

4.2 Virtual Network and Poverty Analysis Fig¬ 
ure [2] shows a relationship between the importance of 
a region in the virtual network with the poverty index. 
This motivates us to find a quantitative measure of the 
importance of the region. 

As a standard nomenclature, in a network the 
spatial regions are called nodes, and virtual paths 
are called edges. Network analysis involves extracting 
some quantitative measures or properties, associated 
with the structure of the network, nodes and/or edges. 
A popular measure is the relative importance of the 
nodes in the network. This measure is referred to as 
centrality m- It identifies the most important nodes in 
a network, and assigns a quantitative score to each node. 
Since importance can have many definitions ranging 
from nodes being central or cohesive, there are several 
centrality measures for networks. To calculate the 
measure we normalize the raw communication matrix 
(M r ) as discussed below. 

4.2.1 Normalization of Information Flow Ma¬ 
trix While previous researchers El have used the in¬ 
formation flow matrix M r to characterize the virtual 
network, we first normalize the matrix to account for 
the disparity in the population across regions, especially 
the strong influence of Dakar owing to its relatively high 
population share ( 22.9%). We construct a normalized 
information matrix M r as follows: 


where ni (and nj) is the number of cell-phone towers in 
region i (and j ) and dij is the as-the-crow-flies distance 




Figure 2: Virtual network for Senegal at Region-level with MPI (Multi-dimensional Poverty) as an overlay. 
Thickness of links indicate the volume of calls and texts exchanged between a pair of regions. Size of the circle at 
each region indicates the total number of incoming and outgoing calls and texts for the region. Note that regions 
with plenty of strong links have lower poverty, while poor regions look isolated. 


between the centroids of regions i and j. We normalize 
the matrix M a in a similar way. 

We use the number of sites (rii) as an indicator of 
the size of the calling population in the given region. 
Thus, the value 1 ^ ± in ( |4.1| ) is proportional to the 
expected number of calls between two regions, which 
is very similar to the well-known Gravity model , which 
has been used in the past to predict the intensity 
of mobile phone calls between cities [9] ( numcalls oc 
^r-). However, instead of using the population of 

the two regions (pi and pj), we use the number of 
sites to get a estimate of the population with mobile 
phones. We set the exponent a for distance as 1 as 
it gave the best correlation with poverty level. The 
normalization procedure removes the impact of regional 
population and spatial distance on the information flow 
and measures the residual flow , assuming that all region 
have equal population and are equidistant from each 
other. 

4.2.2 Measures of Importance and their Cor¬ 
relation with Poverty Besides graph-theoretic mea¬ 
sures, we also investigated direct features that can be 
calculated from the raw communication matrix (M r ) or 
the normalized communication matrix (M r ) discussed 
as follows. 


• Activity: This feature is a simple aggregate of 

outgoing flows from a region (= 'Yh%±j MJj for 
region i). We also used a similar feature derived 
from the normalized matrix (= ^ij f° r re gi° n 

i). Additionally, we investigated other variants 
such as the count of incoming flows , within flows , 
and total flows and found similar relationships with 
the poverty indicators. 

• Eigen Vector and Page Rank Centrality: This 
is derived from the normalized matrix M r ,and is a 
measure of the influence of a node in a graph. Eigen 
vector centrality of a node, u*, is a weighted sum of 
centralities of all of its outgoing connections: 

(4.2) x i = jY / Mt j x j 

3 

where A is some constant. In matrix notation, this 
can be written as Ax = M r x, such that x is the 
eigenvector of the matrix M r corresponding to the 
leading eigenvalue. 

Page Rank is a variant of eigen vector centrality 
and is widely used for ranking websites by search 
engines such as Google. However, the actual role 
of Page Rank is to rank nodes in a network based 
on their importance. It has been noted that the 





Measure 

H 

A 

MPI 

corr 

p -value 

corr 

p -value 

corr 

p -value 

PageRank 

-0.87 

6 x 1 CT 5 

-0.81 

0.0004 

-0.82 

0.0003 

Eigenvalue Centrality 

-0.83 

0.0002 

-0.80 

0.0005 

-0.79 

0.0007 

Gravity Residual 

-0.81 

0.0003 

-0.76 

0.0015 

-0.79 

0.0007 

Introversion 

0.82 

0.0002 

0.70 

0.0040 

0.79 

0.0006 

Activity ( Normalized ) 

-0.81 

0.0008 

-0.76 

0.0003 

-0.79 

0.0015 

Activity (Raw) 

-0.80 

0.0006 

-0.68 

0.0075 

-0.71 

0.0040 


Table 1: Pearson’s r Correlation of region-wise poverty indicators with communication graph features. H - Incidence of 
Poverty, A - Average Intensity Across the Poor, MPI - Multidimensional Poverty Index. 


classic eigen vector centrality (See ( |4.2| )) performs 
poorly for directed networks while the Page Rank 
measure can handle directed networks better. 




Gravity Residual: As shown in (4.1), each entry 


of the normalized matrix M r measures the “resid¬ 
ual” from node i to j after normalizing for popu¬ 
lation and spatial distance. We compute the total 
outgoing residual flow from each node as: 


(4.3) Residuali = £ Mij 

3 


In the past similar measures have been shown 
to correlate negatively with MPI, indicating that 
regions that communicate more are less poor. 


• Introversion: This measures the tendency of the 
population within a region to communicate within 
the region instead of outside. The introversion 
measure can be calculated as: 


(4.4) 


Introversiorii = 



All the above measures give a score for each region in 
Senegal, based on its relative importance. Further, we 
study how these measures correlate with the poverty in¬ 
dex of the regions. The MPI reflects both the incidence 
or headcount ratio ( H ) of poverty, i.e., the proportion 
of the population that is multidimensionally poor and 
the average intensity (A) of their poverty, i.e., the av¬ 
erage proportion of indicators in which poor people are 
deprived. The MPI is calculated by multiplying the inci¬ 
dence of poverty by the average intensity across the poor 
(H x A). Hence, we study the correlation of the network 
features with H , A , and MPI. Table [l] shows the Pear¬ 
son’s r correlation and the corresponding p- values. We 
observe that the various metrics have a strong negative 
correlation with the H value, which is the headcount 
ratio of poverty. Similarly, the metrics have a marked 
negative correlation with A , which is the incidence of 
poor, and also with MPI of the regions. 


4.2.3 Strong influence of Dakar region Although 
pagerank exhibits strong correlation with the indicators 
H, A, and MPI, when we plot the correlation (see Fig¬ 
ure [3| we see that Dakar has a very unique characteristic 
of very high centrality, and very low MPI, and occupies 
a corner in the scatter plot, whereas all other regions 
are spread at the other corner, with mid-to-high MPI 
and low-to-mid pagerank. We, then, remove Dakar, to 
see its effect on the correlation of pagerank with the 
poverty indicators. Surprisingly, we lose the high cor¬ 
relation between pagerank and poverty indicators when 
Dakar is removed. This is evident from Table 12 The 
correlation drops significantly with high p-values. 

We attribute this to its geopolitical heritage and 
past history as a port during the colonial times. It 
is the largest city with 2.47 million people, followed 
closely by Grand Dakar at 2.35 million. There has 
been excessive economic activity in Dakar, which makes 
up more than half of the Senegalese economy in less 
than 1 percent of the national territory. But sustained 
economic development, there needs to be de-centralized 
development focusing on marginalized areas. This fact 
is also validated by the International Monetary Fund’s 
2013 report on Senegal [4]. 

4.2.4 Generating Finer Resolution Poverty 
Maps To illustrate how to derive finer resolution 
poverty maps (at department or arrondissement levels), 
we use pagerank from Table [l] to predict H and A. We 
learn two linear models using ordinary least squares re¬ 
gression to predict the Incidence of Poverty ( H ) and 
Average Intensity across Poor (A). The learnt models 
are: 

(4.5) Hi = -708.32 x PageRanki + 131.94 

(4.6) Ai = —346.66 x PageRanki + 84.58 

Finally, we combine the two estimates to predict the 
MPI as: 


















(4.7) 


MPIi = — x 
100 


A 

100 


The estimated model for predicted MPI is shown in 
Figure [4j Using this model, we can estimate the MPI 
at a finer spatial resolution, as long as we can compute 
the pagerank for the target spatial areas. 



PageRank 


Figure 4: Estimated model for predicting MPI using the 
Page Rank feature. 


The region level predicted MPI map is shown in 
Figure [5] Note its similarity with the true MPI Map of 
Senegal in Figure [l] 

Figure [6] motives the need for finer granularity 
poverty maps than regions. We can observe a significant 
variability in the centrality measure across arrondisse- 
ments within the same region. This indicates that a 
region has varying levels of poverty. For targeted distri¬ 
bution of economic resources, we need finer level poverty 
maps than regions. 

To generate the arrondissement level poverty map, 
we first ^compute its Page Rank using the normalized 
matrix M a . 

Then we use the models in (4.5)—(4.7) to predict the 
MPI for each arrondissement. The predicted poverty 
map is shown in Figure [7J It is interesting to see that 
regions are composed of arrondissements with varying 
poverty index. 


5 Correlating Behavioral Indicators of users 
and Poverty 

In this section, we study how user behavioral statistics, 
gathered from cellular-communications, correlate with 
the poverty indicators. Can this relationship be learnt 
as a model to generate poverty maps at a finer resolu¬ 
tion? 


As previous researchers have shown nm human 
behavioral information extracted from CDR data can 
be used to measure the socio-economic development of 
a region. We study the relationship of several human 
behavior indicators extracted from CDR data with MPI 
with the goal of identifying key indicators which can 
then be used to predict MPI at a finer spatial resolution. 

5.1 Data For this study we use the one year of 
coarse-grained mobility data available at arrondisse¬ 
ment level for 146,352 users (referred to as Set3 data). 
For each user, the data records the location (at ar¬ 
rondissement level) and time (at hourly level) at which 
the user makes a call or sends a text. Additionally, 
the data also contains a monthly set of 33 behavioral 
indicators which capture calling/texting patterns (14), 
mobility patterns (6), and social behavior (13) of each 
user. 


5.2 Aggregating User Behavior For each user, we 
compute the median of the 12 monthly values, for each 
of the 33 indicators. To relate these individual level 
indicators with region level MPI data, we need to assign 
a “home region” to each user. This information is not 
provided in the data set. We employ the following 
localization procedure to assign an arrondissement (and 
a region) to each user in the sample of 146,352. 

5.2.1 Localization of Users For each user we con¬ 
sider the calls made between 8 PM and 12 PM on each 
of the 365 days of the year. We measure the following 
quantities: 

1. df. Fraction of days (out of 365) the user i made 
at least one call between 8 PM and 12 PM in the 
whole year. 

2. di'. The integer id (between 1 and 123) of the ar¬ 
rondissement that the user i called most frequently 
from during those hours. 

3. cy Fraction of total calls made by the user i 
between 8 PM and 12 PM from the arrondissement 

CL {. 

The arrondissement ai is assigned as the “home ar¬ 
rondissement” for user i and the corresponding region 
is the “home region”. We filter out individuals with in¬ 
sufficient (low di) or ambiguous (low cq) information by 
ignoring all users for whom di < 0.5 and q < 0.95, i.e., 
we only considered those users who made a call in the 
night at least half of the days in the year and who called 
from a single arrondissement 95% of the time. After fil¬ 
tering the sample contained 33,323 individuals (23% of 
the original sample). 







MPJ 



(a) With Dakar (b) Without Dakar 

Figure 3: Illustrating influence of Dakar on the relationship between MPI and Page Rank for regions. 


Measure 

H 

A 

MPI 

corr 

p -value 

corr 

p -value 

corr 

p -value 

PageRank with Dakar 

-0.87 

6 x 10“ 5 

-0.81 

0.0004 

-0.82 

0.0003 

PageRank without Dakar 

-0.68 

0.01 

-0.65 

0.016 

-0.64 

0.018 


Table 2: Pearson’s r Correlation of if, A and MPI with Pagerank of the regions considering Dakar and NOT considering 
Dakar. 
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Figure 5: Predicted region level Poverty map using the virtual network. 
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Figure 6: Visual depiction of what happens when MPI is calculated at a region level. All arrondissements within 
each region are assigned the same poverty, but they have varying centrality measures, signifying importance! 
Thus, need to generate finer povery maps for targeted eradication of poverty. The dashed box in the top panel is 
expanded in the bottom panel. 
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Figure 7: Predicted Arrondissement level Poverty map using the virtual network. 

















To verify that the filtered sample represents the 
entire country we compare the region wise distribution 
of the individuals with true 2011 population share and 
the region wise share of the number of cell phone 
antenna sites in Figure [8] We observe that while 
our filtered set of users oversamples from some of 
better developed regions of Senegal, the distribution 
approximates the population as well as number of sites 
(which in turn is an indicator of the number of mobile 
users) across regions. 


users belonging to regions with higher MPI. Similar 
to previous approach, we found the linear regression 
models to predict incidence of poverty (iJ), average 
intensity across poor (A) and eventually, the MPI for a 
given geographical region. The parameters of the linear 
models are: 

(5.8) Hi = -302.65 x PIQ + 119.35 

(5.9) Ai = -151.53 x PIQ + 78.84 

The MPI is calculated by multiplying the two estimates 
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Figure 9: Estimated model for predicting MPI using the 
Page Rank feature. 


Figure 8: Comparing the region-level population share 
for the indicator sample and the true population. 


5.2.2 Region-Level Behavioral Indicators To 

compute the indicators for each region, we consider all 
users assigned to that region. For each indicator, we 
compute the median value for each indicator. Thus we 
obtain 33 median indicators for each region. 

5.3 Aggregated Behavioral Indicators and 
Poverty Analysis For each indicator we compute the 
Pearson’s r correlation between the region level median 
value for that indicator and MPI. Out of 33 indicators, 
11 had an absolute correlation of 0.90 or greater with 
p -value < 0.00001. We chose one of these indicator vari¬ 
ables with the strongest correlation with MPI - Percent¬ 
age Initiated Conversation (PIC). PIC had a negative 
correlation of -0.93 (p -value = 2 x 10 -06 ) with MPI. Ad¬ 
ditionally, this indicator (as well as all other indicators) 
were not significantly influenced by Dakar (correlation 
without Dakar = -0.89, p -value = 4 x 10 -05 ). 

Result for PIC indicate that in regions with low 
MPI users tend to initiate more call/texts than the 


(See (4.7)). The estimated model for MPI using (5.8) 
and (5.9) is shown in Figure [9] Using a similar 
procedure as discussed in previous section, we generate 
an arrondissement level poverty map for Senegal as 
shown in Figure [l0| 


6 Conclusions 

We analyze the virtual network for Senegal, constructed 
from call data records (CDR) in the context of under¬ 
standing poverty. We propose a novel methodology to 
construct such networks at varying spatial resolutions, 
such as regions or arrondissements. We apply network 
centric methods, such as centrality, to measure the im¬ 
portance of each node in the virtual network, where the 
node either corresponds to one of the 14 regions or 123 
arrondissements in Senegal. We show strong correlation 
of centrality and other measures with the poverty index 
of the region level nodes. 

Since Multi-dimensional Poverty Index (MPI) as a 
composite of two individual indices, we learn a model 
that correlates poverty with each of the indicators. This 
allows us to learn a better relationship between the 
network centric measures and MPI. We provide an in- 
depth region-level analysis of the correlation between 
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Figure 10: Predicted arrondissement-level map of MPI for Senegal using Behavioral Indicators. 


centrality and MPI and discover a bias induced by the 
Dakar region and further analyze the cause of such bias. 
We provide an approach to utilize the user behavioral 
indicator data to understand their relationship with the 
MPI. This is the first time such analysis has been done 
to understand MPI. Through our analysis we discover 
indicators which are not only strongly correlated with 
MPI at region level (0.92 Pearson’s r correlation) but 
also are not biased by any particular region, as was 
observed for the centrality measures. 

Since poverty is a complex phenomenon, poverty 
maps showcasing multiple perspectives, such as ours, 
provide policymakers with better insights for effective 
responses for poverty eradication. Poverty maps at 
arrondissement and department levels, or at any spatial 
levels, will enable targeted policies for inclusive growth 
of all the regions in Senegal. The poverty maps 
generated using the behavioral indicators can be used 
to focus policies for certain demographics of the society 
that are specially vulnerable to poverty, such as women 
and specific ethnic groups. 
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